SQL Practice Hub

Sharpen relational reasoning: set-based querying, joins, aggregation, windows, normalization, indexing & transaction semantics.

DDLDMLJoinsWindowIndexTxn

1. Relational Model & Normalization

Data as relations (tables) with keys ensuring integrity.

Keys

Primary key uniquely identifies row. Foreign key enforces referential link.

Integrity

Normalization

1NF atomic, 2NF full dependency, 3NF remove transitive, BCNF stronger determinants.

Denormalization

Performance trade: duplicates for fewer joins; maintain with routines or views.

Constraints

NOT NULL, UNIQUE, CHECK, DEFAULT, FOREIGN KEY; enforce invariants at storage.

Sample Schema

CREATE TABLE users(
  id SERIAL PRIMARY KEY,
  email TEXT UNIQUE NOT NULL,
  created_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE orders(
  id SERIAL PRIMARY KEY,
  user_id INT REFERENCES users(id),
  total_cents INT CHECK(total_cents >= 0)
);

2. Core Querying & Joins

Think in sets; express desired result not loops.

JOIN Types

  • INNER: intersection
  • LEFT: preserve left
  • RIGHT: preserve right
  • FULL: union w/ nulls
  • CROSS: cartesian

Filtering & Grouping

  • WHERE filters rows before grouping
  • HAVING filters groups after aggregate
  • ORDER BY after SELECT projection

Window Functions

  • ROW_NUMBER(), RANK()
  • PARTITION BY groups logical window
  • Frame clauses refine subset

Set Ops

  • UNION (distinct)
  • UNION ALL (no dedupe)
  • INTERSECT, EXCEPT

JOIN Example

SELECT u.email, o.total_cents
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE o.total_cents > 5000
ORDER BY o.total_cents DESC;

3. Aggregation, Windows & Transactions

Summarize data then layer analytic context & consistency.

Aggregation

  • COUNT(*), SUM(col), AVG()
  • GROUP BY keys define grain
  • Distinct counts expensive; sample+estimate

Windows

  • SUM(val) OVER (PARTITION BY k ORDER BY t ROWS 3 PRECEDING)
  • Running totals & moving averages

Transactions

  • BEGIN / COMMIT / ROLLBACK
  • Isolation: Read Uncommitted → Serializable
  • ACID: atomicity, consistency, isolation, durability

Locking

  • Row vs table locks
  • MVCC reduces read contention
  • Deadlock detection & retry

Window Example

SELECT user_id,
       total_cents,
       SUM(total_cents) OVER (PARTITION BY user_id ORDER BY id ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS rolling_3
FROM orders;

4. Indexing & Execution Plans

Accelerate lookups, maintain write cost awareness.

Index Types

  • B-Tree (range, equality)
  • Hash (equality only)
  • GIN / GiST (full-text, arrays)

Selectivity

  • High selectivity = few rows = better candidate
  • Composite index order matters (left prefix)

Execution Plans

  • EXPLAIN / EXPLAIN ANALYZE
  • Seq Scan vs Index Scan
  • Filter vs Index Condition

Anti-Patterns

  • Over-index (write penalty)
  • Functions on indexed column (unless expr index)
  • Leading wildcard LIKE (no use of index)

Index Example

CREATE INDEX idx_orders_user_total ON orders(user_id,total_cents);
EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id=5 AND total_cents>1000;

5. Interactive Lab

Client-side simulations only.

Query Sandbox (Mock)

(result)

JOIN Visualizer

(join)

Normalization Analyzer

(analysis)

Index Advisor

(advice)

6. Query Patterns Cheat Sheet

Reusable structural templates.

Top-N Per Group

SELECT * FROM (
  SELECT *, ROW_NUMBER() OVER(
    PARTITION BY user_id ORDER BY total_cents DESC) rn
  FROM orders
) t WHERE rn <= 3;

Upsert (Postgres)

INSERT INTO users(id,email)
VALUES(1,'a@x')
ON CONFLICT (id)
DO UPDATE SET email=EXCLUDED.email;

Pivot via FILTER

SELECT
  user_id,
  SUM(total_cents) FILTER (WHERE total_cents > 10000) AS high,
  SUM(total_cents) FILTER (WHERE total_cents <= 10000) AS low
FROM orders GROUP BY 1;

Running Total

SELECT id,total_cents,
 SUM(total_cents) OVER (ORDER BY id) run
FROM orders;

Gap Detection

SELECT id, prev_id, id-prev_id-1 AS gap
FROM (
  SELECT id, LAG(id) OVER(ORDER BY id) prev_id
  FROM orders
) t WHERE id-prev_id > 1;

Percentile

SELECT percentile_cont(0.9)
WITHIN GROUP (ORDER BY total_cents)
FROM orders;

7. Review & Mastery

Checklist & conceptual Q&A.

Progress Checklist

Concept Q&A

Why window over self-join for running totals?

Windows scan once with frame semantics; self-joins duplicate row counts & increase complexity.

Primary vs unique key?

PRIMARY KEY = NOT NULL + UNIQUE plus singular table identity; multiple UNIQUE constraints allowed.

High cardinality index issues?

Generally beneficial for selectivity; downside is write amplification & larger index memory footprint.

Why analyze execution plans?

Reveal access paths (seq vs index), join algorithms, row estimates; mismatches hint statistics problems.